ABSTRACT


This research connects several data-driven educational data 
mining approaches to a framework for interaction developed 
in educational research. In particular, 10 million usage data 
points collected by a Learning Management System used by 
students and teachers in 450 online undergraduate courses 
were analyzed with this framework. A range of educational 
data mining techniques were employed, including K-means 
clustering, multiple regression, and classification, to both 
explore and predict student final grades and course com- 
pletion rates. Findings show that support for the overall 
model varied with the way data were mapped to the frame- 
work (e.g., static vs. temporal features) and the analysis 
technique used (with clustering and classification providing 
more useful insights). 


Keywords 
Learning Management System, Interactions in Online Learning


1. INTRODUCTION


Educational data mining (EDM) studies have typically re- 
lied upon data-driven techniques in order to extract use- 
ful patterns and information from large-scale educational 
datasets [11]. While these data-driven approaches have pro- 
vided important contributions, some have argued that their 
inherent a-theoretic nature may fall short in terms of provid- 
ing insight into the development of educational theory and 
practice [6]. As such, more studies are needed that better 
connect EDM findings to educational theory, research, and 
practice. 


To address this need, this paper integrates a theory-driven
approach with a data-driven approach to explore student
learning outcomes, activities, and patterns as students interact
with course content in Canvas, a popular Learning Management
System (LMS). Specifically, for the theory-driven approach, we
apply an interaction framework [2] to explore how patterns in
the LMS data are related to student final grades and course
completion rates at the course level, a macro-perspective.
Here, we use K-means clustering and multiple regression
analysis. For the data-driven approach, we build classifiers
based on machine learning algorithms to predict a student’s
final grade and whether a student will complete a course,
providing a micro-perspective.


In particular, we conducted three tasks addressing the following
research questions: 1) How many clusters of courses are found
based on users’ interaction patterns? Are there relationships
between individual interaction clusters and course features
(size, content, level)? 2) Do the interaction patterns
significantly predict student final grades and course completion
rates? 3) Can we build effective classifiers to predict an
individual student’s final grade and whether each student will
complete a course? Are the pre-built classifiers still robust
and effective for the next semester’s data? How many weeks in a
semester are needed to identify low-performing students or
non-completers (i.e., students who may drop out of a course)?


2. BACKGROUND 


2.1 Interaction in Online Learning 

Interaction has long been a significant research topic in the 
field of educational technology. Nonetheless, it remains a 
hard concept to define, as it is multifaceted and complex [1, 
7]. Some researchers have taken a more restrictive view by 
excluding non-human factors, and focusing only on human 
interactions [5]. However, others argued that both human 
and non-human interactions are integral aspects of the ed- 
ucational experience [1, 2, 4]. Further, supporting various 
combinations of interaction among teacher, student and the 
content can help foster a community of inquiry in online 
learning [4]. 


In particular, Moore [7] categorized interaction into three 
types: (i) learner-content interaction, (ii) learner-instructor 
interaction and (iii) learner-learner interaction. Anderson 
and Garrison [2] expanded Moore’s categorization by differ- 
entiating between teacher-content and student-content in- 
teraction. In their final model, teacher-content (TC) inter- 
action refers to teachers creating content and learning activ- 
ities. Student-content (SC) interaction refers to students’ 
interactions with various forms of educational content in- 
cluding reading texts, completing assignments, and working 
on projects. Student-teacher (ST) interaction includes both
asynchronous and synchronous communication between students
and teachers. Finally, student-student (SS) interaction
refers to interaction between individual students.


Table 1: Characteristics of 450 courses.

Course characteristics | Group | Courses | Percent
STEM | STEM | 116 | 25.8%
STEM | Non-STEM | 334 | 74.2%
Course size | Small (<21) | 107 | 23.8%
Course size | Med (<51) | 210 | 46.7%
Course size | Large (51+) | 133 | 29.5%
Course level | 1000 level | 156 | 34.7%
Course level | 2000 level | 79 | 17.5%
Course level | 3000 level | 157 | 34.9%
Course level | 4000 level | 58 | 12.9%


There have been several empirical studies investigating the 
relationships between different types of interaction and stu- 
dent learning. For example, Bernard et al. [3] conducted 
a meta-analysis on the effects of the three types of interac- 
tions (i.e., SC, ST and SS) on student performance in online 
learning. They found that the effects of SS interaction and 
SC interaction were significantly larger than the effect of ST 
interaction in terms of student performance. 


In this paper, we use this interaction framework to explore 
how interaction is related to student performance and course 
completion rates in online courses by analyzing and explor- 
ing LMS interaction data. 


2.2 Educational Data Mining in Learning Management Systems

An LMS provides a wide range of features to support
interactions between students, teachers, and content [9].
Moreover, the LMS typically captures interactions with these
features in various formats and at diverse granularity levels.
The most widely used methods in EDM studies using LMS 
data are prediction, clustering, and distillation for human 
judgment (visualization) [10]. Prior studies have found that 
usage variables related to SS interaction (i.e., the number 
of discussion messages posted) and SC interaction (i.e., the 
number of completed assignments) were significant predic- 
tors of student performance [6, 12]. 


However, prior studies using LMS data analyzed student-level
data, rather than looking at the various levels and kinds of
interactions between teachers, students, and content. In this
paper, we used course-level data as well as individual
student-level data to provide both macro- and micro-perspectives
on interactions between students, teachers, and content in
online learning. In this way, our research complements the
existing research base.


3. DATASET AND METHODS 
3.1 Dataset 


For the present study, data were extracted from the Canvas 
LMS deployed at a mid-sized public university located in the 
western U.S. The LMS automatically captures all teacher 
and student online interactions. Note that an academic support
unit at the university extracted and anonymized these data, and
the Institutional Review Board (IRB) approved using the data
for research purposes.


We preprocessed the raw data into a shape appropriate for
analysis. First, we performed data cleaning in the following
three steps: 1) selected courses offered between Fall 2014 and
Spring 2015; 2) selected only online undergraduate courses; and
3) excluded low-enrollment courses (i.e., courses with fewer
than 5 enrolled students). After the data cleaning process, our
dataset consisted of 450 courses including 10,576,718
interactions, 21,171 anonymized student profiles (8,844
distinct student profiles) and 450 teacher profiles (228
distinct teacher profiles).


Table 1 shows the number of courses in our dataset, categorized
by STEM vs. non-STEM, size, and course level. 25.8% of the
courses are Science, Technology, Engineering, and Mathematics
(STEM) courses. A full range of course sizes is represented,
centered around medium-sized enrollments (i.e., 21-50
students). The most common course levels are 1000 (34.7%) and
3000 (34.9%).


3.2 Data Mining Methods and Features 


In this study, we used three data mining methods for three 
tasks — one method for each task: (i) K-means clustering 
to find groups of courses each of which has similar inter- 
action patterns at a course level; (ii) multiple regression 
to measure the relationship between each interaction fea- 
ture/variable and average student final grade and course 
completion rates at a course level; and (iii) classification al- 
gorithms to predict each student’s final grade and whether 
the student will complete a course or not. The first two 
methods provided a macro perspective focusing on courses, 
while the last method provided a micro perspective focusing 
on individual students. 


Task 1. We used K-means clustering to identify how online
courses were clustered based on interaction patterns. We used
the PROC FASTCLUS procedure in SAS, as it handles missing
values by computing an adjusted distance using the non-missing
values [8]. We used Euclidean distance to measure the distance
between each node (i.e., a course) and a centroid. To determine
the optimal number of clusters K, we examined the agglomeration
schedule.
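As an illustration of the clustering step, the following is a minimal sketch of K-means with Euclidean distance in plain Python. It is not the SAS PROC FASTCLUS procedure used in the study: the function names, the choice of the first K points as initial centroids, and the toy data are assumptions made only for this example, and missing-value handling is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids).

    Assumption: the first k points seed the centroids. FASTCLUS selects
    its seeds differently, so results on real data may not match.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assign every course to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its assigned courses.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical courses described by two standardized features.
toy_courses = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0),
               (0.2, 0.1), (5.1, 4.9), (9.8, 0.3)]
toy_labels, toy_centroids = kmeans(toy_courses, k=3)
```

In this toy run the six points fall into three pairs, one pair per centroid.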


Task 2. We conducted multiple regressions using SAS to 
test whether each interaction type significantly predicted 
outcome variables — average final grades and course com- 
pletion rates. 
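To make the regression step concrete, here is a minimal sketch of ordinary least squares with an intercept, solved through the normal equations, together with the R² statistic reported later in the results. The study itself used SAS; the helper names and the toy data below are hypothetical, and significance tests and standardized coefficients are left out.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(features, outcome):
    """Return ([intercept, b1, ...], R^2) for a multiple linear regression."""
    rows = [[1.0] + list(r) for r in features]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(p)]
    coefs = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(coefs, r)) for r in rows]
    mean_y = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return coefs, 1.0 - ss_res / ss_tot


# Hypothetical course-level data: two predictors and one outcome.
toy_x = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
toy_y = [1, 3, -2, 0, 2]  # generated as 1 + 2*x1 - 3*x2
toy_coefs, toy_r2 = ols(toy_x, toy_y)
```

Because the toy outcome is an exact linear function of the predictors, the fit recovers the generating coefficients and R² is 1; real course data would give a much lower value.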


For Tasks 1 and 2, we grouped Canvas features (variables) 
into four categories (TC, SC, SS, ST) based on Anderson 
and Garrison’s interaction framework [2]. Table 2 presents 
four categories associated with the Canvas features, and each 
feature’s mean, standard deviation (SD) and minimum and 
maximum values obtained from the 450 courses. 


Table 2: Descriptive statistics of 450 courses analyzed by 12 interaction features associated with four categories.

Category | Feature | Mean | SD | Min-Max
Teacher-Content | Avg. # of attachments posted by a teacher (tc_atta) | 15.97 | 22.86 | 0-176
Teacher-Content | Avg. # of discussion topics posted by a teacher (tc_disc) | 18.55 | 15.54 | 0-107
Teacher-Content | Avg. # of wiki topics posted by a teacher (tc_wiki) | 13.58 | 13.96 | 0-74
Teacher-Content | Avg. # of quizzes posted by a teacher (tc_quiz) | 9.72 | 9.48 | 0-56
Teacher-Content | Avg. # of assignments posted by a teacher (tc_assi) | 15.30 | 12.97 | 0-75
Student-Content | Avg. # of attachments viewed by a student (sc_atta) | 118.19 | 174.57 | 0-1,625
Student-Content | Avg. # of discussions viewed by a student (sc_disc) | 48.05 | 44.88 | 0-296
Student-Content | Avg. # of wiki viewed by a student (sc_wiki) | 54.42 | 51.92 | 0-387
Student-Content | Avg. ratio of completed quiz by a student (sc_quiz) | 0.88 | 0.12 | 0.10-1
Student-Content | Avg. ratio of completed assignments by a student (sc_assi) | 0.78 | 0.16 | 0.10-1
Student-Student | Avg. # of discussions participated in by a student (ss_disc) | 12.21 | 15.13 | 0-101
Student-Teacher | Avg. # of discussions participated in by a teacher (st_disc) | 50.15 | 68.63 | 0-489


Figure 1: Scatter plots showing how courses in clusters are distributed differently (z-transformed data). (a) ST interaction (avg. # of discussions participated in by a teacher) vs. TC interaction (avg. # of attachments posted by a teacher). (b) SS interaction (avg. # of discussions participated in by a student) vs. SC interaction (avg. # of attachments viewed by a student).


Task 3. We applied classification algorithms (i.e., SVM,
Random Forest, J48 and AdaBoost) to predict each student’s
final grade and whether the student will complete a course. The
effectiveness of a classifier depends on the quality of its
features. For this task, we used 129 features consisting of 52
static features and 77 temporal features, as shown in Table 3.
These features included not only the main interaction features
used in the first and second tasks (course-level averages in
the first and second tasks, individual student values in the
third task), but also additional features (e.g., the number of
views of the grade and announcement pages, course information,
and temporal features). In particular, temporal features were
extracted from a series of daily snapshots of each student’s
interaction record. Given a course and the interaction
information of a student who took the course, we represented
the student using the 129 features.
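The temporal part of this representation could be assembled roughly as in the sketch below. It assumes the daily snapshots have already been aggregated into weekly view and participation counts, that a semester spans 18 weeks (inferred from the 36 per-week counts in Table 3), and that the standard deviation is the population form; the function name and these choices are illustrative and not taken from the original pipeline.

```python
from itertools import accumulate
from statistics import mean, pstdev


def temporal_features(weekly_views, weekly_participations):
    """Build the temporal feature vector for one student in one course."""
    # Number of weeks with at least one participation event.
    weeks_active = sum(1 for count in weekly_participations if count > 0)
    # Mean and standard deviation of the weekly view and participation counts.
    summary = [
        mean(weekly_views), pstdev(weekly_views),
        mean(weekly_participations), pstdev(weekly_participations),
    ]
    # Running totals up to and including each week.
    cumulative_views = list(accumulate(weekly_views))
    cumulative_participations = list(accumulate(weekly_participations))
    return (
        [weeks_active] + summary
        + list(weekly_views) + list(weekly_participations)
        + cumulative_views + cumulative_participations
    )


# Hypothetical student who was active only in weeks 1 and 3 of 18.
views = [3, 0, 5] + [0] * 15
participations = [1, 0, 2] + [0] * 15
vector = temporal_features(views, participations)
```

With 18 weeks the vector has 1 + 4 + 36 + 36 = 77 entries, matching the temporal feature count given above.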


4. EXPERIMENTAL RESULTS 


In the previous section, we described our dataset and three 
data mining methods for conducting three tasks. In this 
section, we present results of these experiments using each 
of the methods for each task. 


Table 3: 129 features extracted from each student and each corresponding course.

Static features | # of features
Course level and department offering the course | 2
Total # of views and total # of participation by a student | 2
# of views and participation in each of the 24 items by a student | 48

Temporal features | # of features
Total # of participated weeks (i.e., we add +1 if a student participated at least once in a week) | 1
Mean and standard deviation of weekly view count and weekly participation count | 4
Each week’s view count and participation count | 36
Accumulated weekly view count and accumulated weekly participation count | 36


4.1 Task 1: Clustering Courses and Analyzing Characteristics of Clusters

In Task 1, our research goal was to cluster courses based on
interaction patterns and analyze characteristics of the
clusters. First, we standardized the interaction
features/variables (raw scores) by following the recommendation
in the literature [8]. The raw scores were z-transformed to a
mean of 0 and standard deviation of 1 for either the course or
semester level data.
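The z-transformation described above can be sketched as follows. Whether the original analysis used the population or the sample standard deviation is not stated, so the population form here is an assumption, as are the function name and the example values.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Rescale raw scores to mean 0 and standard deviation 1."""
    center = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    if spread == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]


# Hypothetical raw scores of one interaction feature across eight courses.
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = z_transform(raw_scores)
```

Standardizing each feature this way keeps variables with large raw ranges, such as attachment views, from dominating the Euclidean distances used in clustering.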


K-means clustering requires the number of clusters K as input.
To choose an optimal K, we examined the agglomeration schedule.
The demarcation point indicated that K = 3 would produce the
optimal result. Clusters 1, 2 and 3 contained 41, 300 and 109
courses, respectively. The root mean squared standard
deviations (RMSSTD) of the clusters were 1.32, 0.71 and 0.98,
respectively, indicating that the courses in cluster 1 are more
widely dispersed than those in the other clusters.


We further drew two scatter plots to help understand the
characteristics of the three clusters, as shown in Figure 1.
Figure 1(a) is a scatter plot of ST interaction (st_disc) vs.
TC interaction (tc_atta). Courses in cluster 1 had higher TC
interaction than those in the other clusters, whereas courses
in cluster 3 had higher ST interaction than those in the other
two clusters. Figure 1(b) is a scatter plot of SS interaction
(ss_disc) vs. SC interaction (sc_atta). Courses in cluster 1
showed higher SC interaction than those in the other two
clusters. In contrast, courses in cluster 3 showed higher SS
interaction than those in the other two clusters.


Proceedings of the 9th International Conference on Educational Data Mining 326 


Table 4: Means and standard deviations of clusters. * indicates the highest value among the three clusters.

Feature | Cluster 1 M | Cluster 1 SD | Cluster 2 M | Cluster 2 SD | Cluster 3 M | Cluster 3 SD
tc_atta | 2.12 | 1.78 | -0.32 | 0.44 | 0.09 | 0.67
tc_disc | 0.26 | 0.96 | -0.44 | 0.59 | 1.1 | 1.04
tc_wiki | 1.53 | 1.31 | -0.37 | 0.64 | 0.43 | 0.98
tc_quiz | 0.68 | 1.32 | -0.05 | 0.99 | -0.12 | 0.76
tc_assi | 0.38 | 1.23 | -0.28 | 0.77 | 0.62 | 1.14
T-C | 0.99 | 0.66 | -0.29 | 0.43 | 0.42 | 0.55
sc_atta | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible]
sc_disc | -0.04 | 0.52 | -0.46 | 0.55 | 1.22 | 1.02
sc_wiki | 1.8 | 1.62 | -0.23 | 0.68 | -0.07 | 0.7
sc_quiz | -0.18 | 1.04 | 0.02 | 0.92 | 0.02 | 1.19
sc_assi | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible]
S-C | 0.57 | 0.85 | [illegible] | [illegible] | [illegible] | [illegible]
S-S | [illegible] | [illegible] | [illegible] | [illegible] | 1.05 | 1.22
S-T | [illegible] | [illegible] | [illegible] | [illegible] | 1.07 | 1.33
Final grades | 2.77 | 0.59 | 3.01 | 0.57 | 3.05 | 0.38
Completion rates | 84.04 | 12.95 | 86.84 | 12.75 | 88.09 | 9.18


Table 5: The number of STEM and Non-STEM courses in three clusters.

Cluster | Non-STEM | STEM | Total
C1 | 29 (70.7%) | 12 (29.3%) | 41
C2 | 213 (71.0%) | 87 (29.0%) | 300
C3 | 92 (84.4%) | 17 (15.6%) | 109
Total | 334 | 116 | 450


Table 6: The number of small, medium, large courses in three clusters.

Cluster | Small | Medium | Large | Total
C1 | 13 (31.7%) | 13 (31.7%) | 15 (36.6%) | 41
C2 | 78 (26.0%) | 130 (43.3%) | 92 (30.7%) | 300
C3 | 16 (14.6%) | 67 (61.4%) | 26 (24.0%) | 109
Total | 107 | 210 | 133 | 450


Next, we examined descriptive statistics for the predictors
and outcome variables (final grades and completion rates) for
each cluster, as shown in Table 4 (the meaning of each
feature’s acronym is described in Table 2). The results showed
that cluster 1, dubbed “Content-Interaction courses”, had the
highest means for both TC interaction (M = 0.99, SD = 0.66) and
SC interaction (M = 0.57, SD = 0.85). Cluster 2, dubbed
“Low-Interaction courses”, had the lowest means for all
interaction variables. Lastly, cluster 3, dubbed “Inter-person
Interaction”, had higher means for SS interaction (M = 1.05, SD
= 1.22) and ST interaction (M = 1.07, SD = 1.33). The analysis
revealed that courses in each cluster had different emphases:
content interaction in cluster 1, low interaction in cluster 2,
and person interaction in cluster 3.


Then, we compared the three clusters in terms of average
student final grades and course completion rates. As shown in
Table 4, cluster 3 had the highest mean student final grades
(M = 3.05, SD = 0.38) and course completion rates (M = 88.09,
SD = 9.18) among the three clusters. Cluster 1 had the lowest
mean student final grades (M = 2.77, SD = 0.59) and course
completion rates (M = 84.04, SD = 12.95). This finding suggests
a positive impact of courses focusing on interactions between
participants.


Next, we conducted chi-squared tests to compare STEM and
Non-STEM courses in the three clusters. As shown in Table 5,
the distribution of STEM and Non-STEM courses was significantly
different across the three clusters, χ²(2, N = 450) = 7.80,
p < .05. STEM courses were infrequent overall, but even scarcer
in cluster 3.


Then, we analyzed how many courses in the three clusters had
small, medium and large enrollments. Table 6 shows the
analytical results. The result of a chi-squared test showed
significant differences among the clusters, χ²(4, N = 450) =
15.31, p < .05. Cluster 1 had the largest proportion of large
courses, whereas cluster 3 had the smallest proportion of large
courses. The findings suggest that promoting interaction among
participants is rarer in large courses.


Lastly, we examined how many courses in the three clusters
were at the 1000, 2000, 3000 and 4000 levels. A chi-squared
test found no significant differences in the distribution of
the course levels among the clusters, χ²(6, N = 450) = 8.79,
p > .05.
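The chi-squared statistics above can be checked directly from the published counts. The sketch below computes Pearson's chi-squared statistic and its degrees of freedom for a contingency table; the function name is hypothetical, and the p-value lookup is omitted to keep the example free of external libraries.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of cluster and course type.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Counts from Table 5 (rows: clusters 1-3; columns: Non-STEM, STEM).
stem_counts = [[29, 12], [213, 87], [92, 17]]
# Counts from Table 6 (rows: clusters 1-3; columns: small, medium, large).
size_counts = [[13, 13, 15], [78, 130, 92], [16, 67, 26]]

stem_stat, stem_dof = chi_squared(stem_counts)
size_stat, size_dof = chi_squared(size_counts)
```

For Table 5 this gives a statistic of about 7.79 with (3 - 1)(2 - 1) = 2 degrees of freedom, and for Table 6 about 15.31 with 4 degrees of freedom, in line with the reported values.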


4.2 Task 2: Prediction Using Multiple Regression Analysis

In Task 2, we first conducted a multiple regression analysis
to examine the influence of each interaction feature or feature
category listed in Table 2 in predicting average student final
grades in each course. Table 7 shows the regression results for
significant variables. The results indicated that the
explanatory variables accounted for a modest 15.8% of the
variance (R² = 0.16, F(12, 411) = 6.41, p < .05). Several
significant and negative predictors were found in
teacher-content interaction. In particular, as tc_disc,
tc_wiki, and tc_assi increased, final grades tended to
decrease. Findings in the student-content interaction category
were the opposite: final grades tended to increase when sc_quiz
and sc_assi increased, and the same was true in the
student-teacher interaction category.


A second multiple regression analysis was conducted to test
the influence of each interaction feature or feature category
on course completion rates. The explained variance was a modest
15.7% (R² = 0.16, F(12, 411) = 6.64). Only a single
teacher-content variable, tc_wiki, was significant and
negative. Student-content interaction features sc_quiz and
sc_assi were again significant and positive in relation to
course completion rates. Taken together, these findings suggest
that certain teacher activities related to content were
associated with lower final grades and course completion rates,
whereas student activities related to content were associated
with higher ones.
Table 7: Multiple regression results (* indicates the feature is significant at the 0.05 level, and the table includes only significant features).

Category | Feature | Final grades: B | SE(B) | β | t | p | Completion rates: B | SE(B) | β | t | p
 | Intercept | 0.000 | 0.089 | 0.000 | 29.600 | 0.001 | 0.000 | 0.089 | 0.000 | 29.600 | <.0001
Teacher-Content Interaction | tc_disc | -0.006 | 0.003 | -0.177* | -2.240 | 0.026 | -0.078 | 0.060 | -0.059 | -0.990 | 0.324
Teacher-Content Interaction | tc_wiki | -0.011 | 0.002 | -0.295* | -4.540 | 0.001 | -0.241 | 0.054 | -0.202* | -3.710 | 0.000
Teacher-Content Interaction | tc_assi | 0.004 | 0.002 | 0.106* | 1.970 | 0.050 | 0.037 | 0.048 | 0.033 | 0.690 | 0.490
Student-Content Interaction | sc_wiki | 0.001 | 0.001 | 0.141* | 2.140 | 0.033 | 0.125 | 0.015 | 0.029 | 1.900 | 0.058
Student-Content Interaction | sc_quiz | 0.003 | 0.001 | 0.164* | 3.250 | 0.001 | 0.284 | 0.019 | 0.107* | 5.650 | <.0001
Student-Content Interaction | sc_assi | 0.003 | 0.001 | 0.177 | 3.530 | 0.001 | 0.115 | 0.019 | 0.044 | 2.290 | 0.023
Student-Teacher Interaction | st_disc | 0.001 | 0.001 | 0.160* | 2.340 | 0.020 | 0.130 | 0.011 | 0.022 | 1.910 | 0.057


Table 8: Feature Sets

Feature Set | Features (# of features)
A | Course level and department offering the course, total # of views and total # of participation (4)
B | Feature set A + # of views and participation in each of the 24 items by a student (52)
C | Feature set B + total # of participated weeks (53)
D | Feature set C + mean and standard deviation of weekly view count and weekly participation count (57)
E | Feature set D + each week’s view count and participation count, and accumulated weekly view count and participation count (129)


4.3 Task 3: Predicting Individual Student’s Final Grade and Course Completion

So far, the experiments in Tasks 1 and 2 were conducted at the
course level, providing a macro perspective. Now we turn to
building classifiers to predict an individual student’s final
grade and course completion (i.e., whether the student will
complete the course) using a data-driven approach, providing a
micro perspective, and then evaluating the effectiveness of the
classifiers. In Task 3, predicting a student’s final grade
means predicting whether the student will belong to a high
performance group (i.e., obtaining one of A, A-, B+, B and B-)
or a low performance group (i.e., obtaining one of C+, C, C-,
D+, D, F and W).
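The grouping of letter grades into the two target classes can be written down as a small lookup. The grade sets come from the definition above; the function name and the decision to reject any grade outside the two listed sets are assumptions made for this sketch.

```python
HIGH_GRADES = {"A", "A-", "B+", "B", "B-"}
LOW_GRADES = {"C+", "C", "C-", "D+", "D", "F", "W"}


def performance_group(grade):
    """Map a letter grade to the binary label used for classification."""
    normalized = grade.strip().upper()
    if normalized in HIGH_GRADES:
        return "high"
    if normalized in LOW_GRADES:
        return "low"
    # Grades not listed in the paper are rejected rather than guessed.
    raise ValueError(f"grade not covered by the two groups: {grade!r}")


# Hypothetical roster of final grades.
labels = [performance_group(g) for g in ["A", "B-", "C+", "W"]]
```

Note that a withdrawal (W) counts toward the low performance group under this definition.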


4.3.1 Prediction in 2014 Fall Semester Dataset 

In this experiment, we used the 2014 Fall semester dataset
consisting of 229 courses with 4,314,425 interactions and
10,003 anonymized student profiles. To build highly accurate
classifiers, it is important to use features that have
significant distinguishing power. To test this, the 129
features listed in Table 3 were grouped into five feature sets,
A, B, C, D and E, as shown in Table 8. Each feature set from A
to E includes all features of the previous set plus additional
features. Feature sets A and B consisted of only static
features, while feature sets C, D and E consisted of static and
temporal features.


Figure 2: Prediction results of SVM, Random Forest, J48 and AdaBoost based classifiers with five feature sets. (a) Final grades. (b) Course completion.


Since we did not know a priori which classification algorithm
would perform best, we chose four popular classification
algorithms: SVM, Random Forest, J48 and AdaBoost. Given the
2014 Fall semester dataset, we performed 10-fold
cross-validation by dividing the dataset into 10 sub-samples.
Each sub-sample in turn became a test set, while the other 9
sub-samples became a training set. We conducted a
classification experiment for each of the 10 pairs of training
and test sets. Then, we averaged the 10 classification results.
We repeated this process for each classification algorithm.
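The cross-validation procedure can be sketched as below. To stay self-contained, a trivial majority-class predictor stands in for the SVM, Random Forest, J48 and AdaBoost classifiers, and the interleaved assignment of records to folds is an assumption; the original experiments may have partitioned the data differently.

```python
def k_fold_splits(n, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    # Assumption: record i goes to fold i mod k (no shuffling).
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def fit_majority(train_labels):
    """Stand-in 'classifier': remember the most frequent training label."""
    return max(sorted(set(train_labels)), key=train_labels.count)


def cross_validate(labels, k=10):
    """Average test accuracy of the stand-in classifier over k folds."""
    accuracies = []
    for train, test in k_fold_splits(len(labels), k):
        predicted = fit_majority([labels[i] for i in train])
        hits = sum(1 for i in test if labels[i] == predicted)
        accuracies.append(hits / len(test))
    return sum(accuracies) / k


# Hypothetical label vector: 70 high-performing and 30 low-performing students.
toy_labels = ["high"] * 70 + ["low"] * 30
toy_accuracy = cross_validate(toy_labels)
```

Swapping `fit_majority` for a real learner only requires a fit step on the training indices and a predict step on the test indices; the fold bookkeeping and the averaging stay the same.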


Figure 2 shows prediction results for final grades/performance
groups and course completion. The SVM-based classifier
outperformed the Random Forest, J48 and AdaBoost based
classifiers, achieving 80.95% accuracy, 0.79 F-measure and 0.72
AUC in final grade prediction and 94.41% accuracy, 0.94
F-measure and 0.85 AUC in course completion prediction. As we
added more features (moving from feature set A to E), the SVM
classifier’s accuracy increased in both predictions. Compared
with the baseline, which was measured as the percentage of
majority-class instances and achieved 68% accuracy in final
grade prediction and 84% in course completion prediction, our
SVM-based classifier improved accuracy by 19% (= 80.95/68 - 1)
in final grade prediction and by 12.4% (= 94.41/84 - 1) in
course completion prediction.
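The improvement figures are relative gains over the majority-class baseline rather than differences in percentage points. A short worked check, using the accuracies reported above and a hypothetical helper name:

```python
def relative_improvement(accuracy, baseline):
    """Relative gain of a classifier's accuracy over a baseline accuracy."""
    return accuracy / baseline - 1


# Accuracies (in percent) reported for the SVM classifier and the baseline.
grade_gain = round(relative_improvement(80.95, 68) * 100, 1)
completion_gain = round(relative_improvement(94.41, 84) * 100, 1)
```

This gives a gain of 19.0% for final grade prediction and 12.4% for course completion prediction. In percentage points, the corresponding differences are 12.95 and 10.41.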


4.3.2 Robustness of Our Prediction Model 

In Section 4.3.1, we evaluated the effectiveness of our
classification approach for both final grade prediction and
course completion prediction. Now we are interested in how
robust the pre-built model is when we apply it to data
generated in the future (i.e., future semesters). To simulate
this scenario, we used the 2014 Fall semester dataset as a
training set and the 2015 Spring semester dataset as a test set
(consisting of 221 courses with 6,262,293 interactions and
11,168 anonymized student profiles). We built an SVM-based
classifier and predicted each student’s final grade and course
completion in the test set.


Figure 3: Prediction results obtained by applying SVM-based classifiers trained by 2014 Fall dataset to 2015 Spring dataset. (a) Final grades. (b) Course completion.


Figure 4: Prediction results over time. (a) Final grades. (b) Course completion.


Figure 3 shows prediction results for feature sets A through
E. Again, using all the features (feature set E) produced the
best results, achieving 78.64% accuracy and 0.682 AUC in final
grade prediction and 93.06% accuracy and 0.817 AUC in course
completion prediction. Compared with the previous experimental
results in Section 4.3.1, there were only small reductions in
accuracy: 2.31 percentage points (final grades) and 1.35
percentage points (course completion). These results suggest
that our proposed approach is robust and can be applied to
future semesters.


4.3.3 Early Prediction 

The previous experimental results showed that our approach was
effective in predicting final grades and course completion. In
practice, it is better to produce predictions earlier, so that
a tool/system can automatically identify and flag students who
are at risk of receiving a low grade or dropping out of a
course and thus require intervention by a teacher. To address
this need, we used daily snapshots of data including student
profiles, course information and interaction logs, and
simulated the scenario by building an SVM-based classifier in
each week and evaluating its performance. In this way, we
examined how the classifier’s performance changed over time,
and when we could achieve reasonable accuracy.


Figure 4 shows prediction results in the 2014 Fall dataset. In
final grade prediction, when we built classifiers in the 7th
week, 10th week and 15th week, we achieved 73.59%, 75.86% and
78.28% accuracy, respectively. Similarly, in course completion
prediction, we achieved 89.4% and 93.3% accuracy in the 10th
week and 16th week, respectively. Overall, adding more data
improved the performance of our classifiers. This study shows
that it is possible to detect early those students who have a
higher chance of receiving low grades or dropping out of a
course.


5. CONCLUSIONS 


The purpose of this study was to explore relationships be- 
tween theoretically defined constructs extracted from a Learn- 
ing Management System and student learning outcomes. 
Three different tasks employing three different methods were 
used to explore these relationships. The first two tasks were 
conducted at the macro-level and thus aligned with a theory- 
driven approach, whereas the last task at the micro level 
aligned with a data-driven approach. 


Results from the cluster analysis revealed that courses with 
high inter-person (SS, ST) interaction had higher final grades 
and completion rates than courses in the other clusters (low- 
interaction and content-interaction), aligning with results 
from previous studies [6, 12]. Results also suggested that 
STEM and large courses tended to exhibit fewer of these pro- 
ductive interactions. The micro-level, data-driven machine 
learning analysis using prediction with SVM enabled the 
discovery of at-risk students with high accuracy. It achieved 
the best performance when all temporal features (complete 
feature set) were taken into consideration and was robust 
when predicting future data. 


In sum, for this dataset of LMS interactions drawn from online
undergraduate courses, the interaction framework was useful for
interpreting results at both macro and micro levels.